Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space
Authors
Abstract
The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs, and that it models motifs with a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than, motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
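To make the idea of a PWM paired with a learned threshold concrete, here is a minimal Python sketch. It is not the MotifSpec implementation: the function names (`log_odds_pwm`, `best_score`, `learn_threshold`), the uniform background, the pseudocount, the forward-strand-only scan, the toy sequences, and the use of plain classification accuracy as the discriminative objective are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log-odds PWM (assumed uniform background)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_score(pwm, seq):
    """Return the highest PWM score over all windows of seq (forward strand only)."""
    width = len(pwm)
    if len(seq) < width:
        return float("-inf")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def learn_threshold(pwm, positives, negatives):
    """Pick the score cutoff that maximizes accuracy on labeled sequences."""
    scored = [(best_score(pwm, s), 1) for s in positives]
    scored += [(best_score(pwm, s), 0) for s in negatives]
    best_thr, best_acc = None, -1.0
    # Candidate cutoffs are the observed scores; a sequence is called
    # "bound" when its best window scores at or above the cutoff.
    for thr in sorted({score for score, _ in scored}):
        correct = sum((score >= thr) == bool(label) for score, label in scored)
        acc = correct / len(scored)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr, best_acc

# Toy data (hypothetical): a 4-bp GATA-like motif and hand-made sequences.
counts = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
positives = ["CCGATACC", "TTGATAGG", "GATACCCC"]
negatives = ["CCCCCCCC", "TTTTGGGG", "ACGTACGT"]

pwm = log_odds_pwm(counts)
threshold, accuracy = learn_threshold(pwm, positives, negatives)
print(f"learned threshold = {threshold:.3f}, training accuracy = {accuracy:.2f}")
```

The point of the sketch is the objective: the cutoff is chosen by how well it separates bound from unbound sequences, not by how over-represented the motif is in the bound set.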
Similar resources
Identification, isolation and bioinformatics analysis of specific tuber promoter in plants
In this study, to find a suitable tuber promoter, an experiment was conducted at Shahid Beheshti University in 2018. For this purpose, promoter sequences of different tuberous plants were searched at NCBI. Sequences were multiple-aligned and the target primers were designed from conserved regions. PCR analysis confirmed the presence of the desired promoter in plants of sweet potato a...
On counting position weight matrix matches in a sequence, with application to discriminative motif finding
MOTIVATION AND RESULTS The position weight matrix (PWM) is a popular method to model transcription factor binding sites. A fundamental problem in cis-regulatory analysis is to "count" the occurrences of a PWM in a DNA sequence. We propose a novel probabilistic score to solve this problem of counting PWM occurrences. The proposed score has two important properties: (1) It gives appropriate weigh...
The Identification of the Modal Parameters of Orbital Machines using Dynamic Structural Approach
The researcher measured the least number of frequency response functions required for the identification of modal parameters, in order to simplify the identification of modal properties of such systems. In this work, the orbital machines are supposed to be a combination of orbital and non-orbital components. Structural Approach specified the identification of dynamic properties only to those ph...
Adaptive Predictive Controllers Using a Growing and Pruning RBF Neural Network
An adaptive version of a growing and pruning RBF neural network has been used to predict the system output and implement Linear Model-Based Predictive Controller (LMPC) and Non-linear Model-based Predictive Controller (NMPC) strategies. A radial-basis neural network with growing and pruning capabilities is introduced to carry out on-line model identification. An Unscented Kal...
Probabilistic Power Distribution Planning Using Multi-Objective Harmony Search Algorithm
In this paper, power distribution planning (PDP) considering distributed generators (DGs) is investigated as a dynamic multi-objective optimization problem. Moreover, Monte Carlo simulation (MCS) is applied to handle the uncertainty in electricity price and load demand. In the proposed model, investment and operation costs, losses and purchased power from the main grid are incorporated in the f...
Journal title:
Volume 10, Issue
Pages -
Publication date: 2015